13 research outputs found

    Learning Deep SPD Visual Representation for Image Classification

    Symmetric positive definite (SPD) visual representations are effective due to their ability to capture high-order statistics that describe images. Reliable and efficient calculation of an SPD matrix representation from small-sized feature maps with a high number of channels in a CNN is a challenging issue. This thesis presents three novel methods to address this challenge. The first method, called Relation Dropout (ReDro), is inspired by the fact that the eigen-decomposition of a block-diagonal matrix can be obtained efficiently by eigen-decomposing each block separately. Thus, instead of using a full covariance matrix as in the literature, this thesis randomly groups the channels and forms a covariance matrix per group. ReDro is inserted as an additional layer preceding the matrix normalisation step, and the random grouping is made transparent to all subsequent layers. ReDro can be seen as a dropout-related regularisation which discards some pairwise channel relationships within each group. The second method, called FastCOV, exploits the intrinsic connection between the eigensystems of XX^T and X^TX. Specifically, it computes a position-wise covariance matrix over the convolutional feature maps instead of the typical channel-wise covariance matrix. As the spatial size of feature maps is usually much smaller than the channel number, eigen-decomposing the position-wise covariance matrix avoids rank deficiency and is faster than decomposing the channel-wise covariance matrix. The eigenvalues and eigenvectors of the normalised channel-wise covariance matrix can then be retrieved through the connection between the XX^T and X^TX eigensystems. The third method, iSICE, deals with reliable covariance estimation from small-sized, high-dimensional CNN feature maps. It exploits the prior structure of the covariance matrix to estimate a sparse inverse covariance, a technique developed in the literature to deal with the covariance matrix's small-sample issue. Given a covariance matrix, this thesis iteratively minimises its log-likelihood, penalised by a sparsity term, with gradient descent. The resulting representation characterises partial correlation instead of the indirect correlation characterised by the covariance representation. As experimentally demonstrated, all three proposed methods improve image classification performance, and the first two also reduce the computational cost of learning large SPD visual representations.
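The XX^T / X^TX connection that FastCOV exploits is a standard linear-algebra fact and can be checked numerically. The NumPy sketch below (with illustrative sizes, not the thesis's implementation) eigen-decomposes only the small position-wise matrix X^TX and recovers the eigen-system of the large channel-wise matrix XX^T from it:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 256, 49          # channels >> spatial positions, as in CNN feature maps
X = rng.standard_normal((d, n))

C_big = X @ X.T         # channel-wise covariance-like matrix (d x d, rank-deficient)
C_small = X.T @ X       # position-wise matrix (n x n, full rank)

# Eigen-decompose only the small matrix.
vals, V = np.linalg.eigh(C_small)

# The nonzero eigenvalues of X X^T equal those of X^T X ...
big_vals = np.linalg.eigvalsh(C_big)
assert np.allclose(np.sort(big_vals)[-n:], np.sort(vals))

# ... and the matching eigenvectors are recovered as u = X v / sqrt(lambda).
U = X @ V / np.sqrt(vals)
assert np.allclose(C_big @ U, U * vals, atol=1e-6)
```

Because n is much smaller than d here, the decomposition of the n x n matrix is the cheap one, which is the source of FastCOV's speed-up.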

    Deep Learning based HEp-2 Image Classification: A Comprehensive Review

    Classification of HEp-2 cell patterns plays a significant role in the indirect immunofluorescence test for identifying autoimmune diseases in the human body. Many automatic HEp-2 cell classification methods have been proposed in recent years, amongst which deep learning based methods have shown impressive performance. This paper provides a comprehensive review of the existing deep learning based HEp-2 cell image classification methods. These methods perform HEp-2 image classification at two levels, namely, cell-level and specimen-level. Both levels are covered in this review. At each level, the methods are organized with a deep-network-usage based taxonomy. The core idea, notable achievements, and key strengths and weaknesses of each method are critically analyzed. Furthermore, a concise review of the existing HEp-2 datasets that are commonly used in the literature is given. The paper ends with a discussion on novel opportunities and future research directions in this field. It is hoped that this paper will provide readers with a thorough reference on this novel, challenging, and thriving field. (Comment: published in Medical Image Analysis.)

    Learning Partial Correlation based Deep Visual Representation for Image Classification

    Visual representation based on the covariance matrix has demonstrated its efficacy for image classification by characterising the pairwise correlation of different channels in convolutional feature maps. However, pairwise correlation becomes misleading once another channel correlates with both channels of interest, resulting in a "confounding" effect. In this case, the "partial correlation", which removes the confounding effect, should be estimated instead. Nevertheless, reliably estimating partial correlation requires solving a symmetric positive definite matrix optimisation known as sparse inverse covariance estimation (SICE). How to incorporate this process into a CNN remains an open issue. In this work, we formulate SICE as a novel structured layer of a CNN. To ensure end-to-end trainability, we develop an iterative method to solve the above matrix optimisation during the forward and backward propagation steps. Our work obtains a partial-correlation based deep visual representation and mitigates the small-sample problem often encountered by covariance matrix estimation in CNNs. Computationally, our model can be effectively trained on GPUs and works well with the large number of channels of advanced CNNs. Experiments show the efficacy and superior classification performance of our deep visual representation compared to covariance matrix based counterparts. (Comment: published at CVPR 2023.)
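The SICE objective named above can be illustrated with a toy proximal-gradient solver. The sketch below minimises the penalised negative log-likelihood tr(SP) - log det P + lam * ||P||_1 over SPD matrices P; it is a rough illustration of the objective, not the paper's actual structured layer, and the step size, penalty weight and eigenvalue floor are illustrative choices:

```python
import numpy as np

def sice_sketch(S, lam=0.05, lr=0.01, steps=500, eps=1e-3):
    """Toy sparse inverse covariance estimate by proximal gradient descent.

    Minimises  tr(S P) - log det P + lam * ||P||_1  over SPD matrices P.
    An illustration of the SICE objective, not the paper's exact solver.
    """
    d = S.shape[0]
    P = np.eye(d)
    for _ in range(steps):
        grad = S - np.linalg.inv(P)           # gradient of the smooth part
        P = P - lr * grad
        # Soft-threshold the off-diagonal entries (prox of the l1 penalty).
        off = np.sign(P) * np.maximum(np.abs(P) - lr * lam, 0.0)
        np.fill_diagonal(off, np.diag(P))
        P = 0.5 * (off + off.T)               # keep the iterate symmetric
        # Project back into the SPD cone by flooring the eigenvalues.
        w, V = np.linalg.eigh(P)
        P = (V * np.maximum(w, eps)) @ V.T
    return P

# Two positively correlated variables yield a negative off-diagonal entry
# in the precision matrix, i.e. a positive partial correlation.
P = sice_sketch(np.array([[1.0, 0.5], [0.5, 1.0]]))
```

The partial correlation between variables i and j is then read off as -P[i, j] / sqrt(P[i, i] * P[j, j]), which is the quantity the paper's representation is built from.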

    Spatio-temporal mid-level feature bank for action recognition in low quality video

    It is a great challenge to perform high-level recognition tasks on videos that are poor in quality. In this paper, we propose a new spatio-temporal mid-level (STEM) feature bank for recognizing human actions in low-quality videos. The feature bank comprises a trio of local spatio-temporal features, i.e. shape, motion and texture, which respectively encode structural, dynamic and statistical information in video. These features are encoded into mid-level representations and aggregated to construct STEM. Based on the recent binarized statistical image feature (BSIF), we also design a new spatio-temporal textural feature extracted discriminatively from salient 3D patches. Extensive experiments on the poor-quality versions/subsets of the KTH and HMDB51 datasets demonstrate the effectiveness of the proposed approach.
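The BSIF-style encoding mentioned above follows a simple recipe: convolve a patch with a filter bank, binarise each response, and pack the bits into one integer code whose histogram serves as a texture descriptor. The sketch below shows that recipe; real BSIF learns its filters with ICA on natural image patches, so the random filters and sizes here are stand-ins:

```python
import numpy as np

def binarized_filter_code(patch, filters):
    """BSIF-style code for one patch: binarised filter responses packed
    into an integer. Random filters stand in for BSIF's ICA-learned ones."""
    bits = [int(np.sum(patch * f) > 0) for f in filters]
    return sum(b << i for i, b in enumerate(bits))

rng = np.random.default_rng(1)
filters = rng.standard_normal((8, 5, 5))      # 8 filters -> 256 possible codes
image = rng.standard_normal((32, 32))

# Slide over the image and histogram the codes, giving a texture descriptor.
codes = [
    binarized_filter_code(image[r:r + 5, c:c + 5], filters)
    for r in range(28) for c in range(28)
]
hist, _ = np.histogram(codes, bins=256, range=(0, 256))
```

The paper's spatio-temporal variant applies this idea to 3D (video) patches rather than 2D image patches.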

    On the Effects of Low Video Quality in Human Action Recognition

    Human activity recognition is one of the most intensively studied areas of computer vision and pattern recognition in recent years. A wide variety of approaches have been shown to work well against challenging image variations such as appearance, pose and illumination. However, the problem of low video quality remains an unexplored and challenging issue in real-world applications. In this paper, we investigate the effects of low video quality in human action recognition from two perspectives: videos that are poorly sampled spatially (low resolution) and temporally (low frame rate), and compressed videos affected by motion blurring and artifacts. In order to increase the robustness of feature representation under these conditions, we propose the usage of textural features to complement the popular shape and motion features. Extensive experiments were carried out on two well-known benchmark datasets of contrasting nature: the classic KTH dataset and the large-scale HMDB51 dataset. Results obtained with two popular representation schemes (Bag-of-Words, Fisher Vectors) further validate the effectiveness of the proposed approach.
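The two sampling degradations studied above, lower resolution and lower frame rate, amount to subsampling a video volume along its spatial and temporal axes. A minimal sketch, with illustrative step sizes (real evaluation protocols may also blur or re-compress):

```python
import numpy as np

def degrade(video, spatial_step=2, temporal_step=3):
    """Sketch of the two degradations studied: temporal downsampling
    (lower frame rate) and spatial downsampling (lower resolution).
    `video` is a (frames, height, width) array."""
    low_fps = video[::temporal_step]                       # drop frames
    low_res = low_fps[:, ::spatial_step, ::spatial_step]   # subsample pixels
    return low_res

clip = np.zeros((30, 120, 160))
assert degrade(clip).shape == (10, 60, 80)
```

Feature robustness is then compared between the original clip and such degraded versions.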

    Exploiting textures for better action recognition in low-quality videos

    Human action recognition is an increasingly mature field of study, owing to its widespread use in various applications. A number of related research problems, such as feature representations, human pose and body-part detection, and scene/object context, are being actively studied. However, the general problem of video quality, a realistic issue given low-cost surveillance infrastructure and mobile devices, has not been systematically investigated from various aspects. In this paper, we address the problem of action recognition in low-quality videos from a myriad of perspectives: spatial and temporal downsampling, video compression, and the presence of motion blurring and compression artifacts. To increase the resilience of feature representation in these types of videos, we propose to use textural features to complement classical shape and motion features. Extensive experiments were carried out on low-quality versions of three publicly available datasets: KTH, UCF-YouTube and HMDB. Experimental results and analysis suggest that leveraging textural features can significantly improve action recognition performance under low video quality conditions.

    Action recognition in low quality videos by jointly using shape, motion and texture features

    Shape, motion and texture features have recently gained much popularity for human action recognition. While many of these descriptors have been shown to work well against challenging variations such as appearance, pose and illumination, the problem of low video quality remains relatively unexplored. In this paper, we propose a new idea of jointly employing these three features within a standard bag-of-features framework to recognize actions in low-quality videos. The performance of these features was extensively evaluated and analyzed under three spatial downsampling and three temporal downsampling modes. Experiments conducted on the KTH and Weizmann datasets with several combinations of features and settings showed the importance of all three features (HOG, HOF, LBP-TOP), and how low-quality videos can benefit from the robustness of textural features.
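In a standard bag-of-features pipeline, jointly employing the three cues typically means quantising each descriptor type against its own codebook and concatenating the resulting histograms. The sketch below shows that combination step; the descriptor dimensions, codebook size and random data are placeholders, not the paper's actual settings:

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Quantise local descriptors against a codebook and return the
    normalised visual-word histogram (standard bag-of-features step)."""
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)                  # nearest codeword per descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / max(hist.sum(), 1.0)

rng = np.random.default_rng(0)
# Hypothetical local features for one video: shape (HOG), motion (HOF),
# texture (LBP-TOP); the dimensions here are illustrative only.
feats = {
    "hog": rng.standard_normal((200, 72)),
    "hof": rng.standard_normal((200, 90)),
    "lbptop": rng.standard_normal((200, 48)),
}
codebooks = {k: rng.standard_normal((64, v.shape[1])) for k, v in feats.items()}

# Jointly use the three cues by concatenating their histograms.
video_repr = np.concatenate(
    [bow_histogram(feats[k], codebooks[k]) for k in ("hog", "hof", "lbptop")]
)
```

The concatenated vector is then fed to a classifier such as an SVM, one vector per video.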

    Leveraging textural features for recognizing actions in low quality videos

    Human action recognition is a well-researched problem which becomes considerably more challenging when video quality is poor. In this paper, we investigate human action recognition in low-quality videos by leveraging the robustness of textural features to better characterize actions, instead of relying on shape and motion features, which may fail under noisy conditions. To accommodate videos, texture descriptors are extended to three orthogonal planes (TOP) to extract spatio-temporal features. Extensive experiments were conducted on low-quality versions of the KTH and HMDB51 datasets to evaluate the performance of our proposed approaches against standard baselines. Experimental results and further analysis demonstrate the usefulness of textural features in improving the recognition of human actions from low-quality videos.
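The three-orthogonal-planes extension treats a video as a (T, H, W) volume and slices it along the XY, XT and YT planes, so that a 2D texture descriptor also captures motion texture along time. A minimal geometric sketch (a real TOP descriptor, e.g. LBP-TOP, computes texture codes over every slice of each plane and concatenates the three histograms):

```python
import numpy as np

def top_slices(volume):
    """Three Orthogonal Planes (TOP) sketch: mid slices of a (T, H, W)
    video volume in the XY, XT and YT planes."""
    T, H, W = volume.shape
    xy = volume[T // 2]          # one frame: spatial appearance texture
    xt = volume[:, H // 2, :]    # a row over time: horizontal motion texture
    yt = volume[:, :, W // 2]    # a column over time: vertical motion texture
    return xy, xt, yt

vid = np.arange(4 * 6 * 8).reshape(4, 6, 8)
xy, xt, yt = top_slices(vid)
assert xy.shape == (6, 8) and xt.shape == (4, 8) and yt.shape == (4, 6)
```

Applying the same 2D descriptor on all three plane orientations is what turns a purely spatial texture feature into a spatio-temporal one.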